Employing machine learning algorithms to detect unknown scanning and email worms

نویسندگان

  • Shubair Abdulla
  • Sureswaran Ramadass
  • Altyeb Altaher Altyeb
  • Amer Al-Nassiri
چکیده

We present a worm detection system that leverages the reliability of IP-Flow and the effectiveness of learning machines. Typically, a host infected by a scanning or an email worm initiates a significant amount of traffic that does not rely on DNS to translate names into numeric IP addresses. Based on this fact, we capture and classify NetFlow records to extract feature patterns for each PC on the network within a certain period of time. A feature pattern includes: No of DNS requests, no of DNS responses, no of DNS normals, and no of DNS anomalies. Two learning machines are used, K-Nearest Neighbors (KNN) and Naive Bayes (NB), for the purpose of classification. Solid statistical tests, the cross-validation and paired t-test, are conducted to compare the individual performance between the KNN and NB algorithms. We used the classification accuracy, false alarm rates, and training time as metrics of performance to conclude which algorithm is superior to another. The data set used in training and testing the algorithms is created by using 18 real-life worm variants along with a big amount of benign

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Detection of unknown computer worms based on behavioral classification of the host

Machine learning techniques are widely used in many fields. One of the applications of machine learning in the field of the information security is classification of a computer behavior into malicious and benign. Anti viruses consisting on signature-based methods are helpless against new (unknown) computer worms. This paper focuses on the feasibility of accurately detecting unknown worm activit...

متن کامل

A Classification Method for E-mail Spam Using a Hybrid Approach for Feature Selection Optimization

Spam is an unwanted email that is harmful to communications around the world. Spam leads to a growing problem in a personal email, so it would be essential to detect it. Machine learning is very useful to solve this problem as it shows good results in order to learn all the requisite patterns for classification due to its adaptive existence. Nonetheless, in spam detection, there are a large num...

متن کامل

Application of ensemble learning techniques to model the atmospheric concentration of SO2

In view of pollution prediction modeling, the study adopts homogenous (random forest, bagging, and additive regression) and heterogeneous (voting) ensemble classifiers to predict the atmospheric concentration of Sulphur dioxide. For model validation, results were compared against widely known single base classifiers such as support vector machine, multilayer perceptron, linear regression and re...

متن کامل

Email Worm Detection by Flow Level Data Mining DNS Query Streams

Email worms remain a major network security concern, as they increasingly attack systems with intensity using more advanced social engineering tricks. Their extremely high prevalence clearly indicates that current network defence mechanisms are intrinsically incapable of mitigating email worms, and thereby reducing unwanted email traffic traversing the Internet. In this paper we study the effec...

متن کامل

Comparative Analysis of Machine Learning Algorithms with Optimization Purposes

The field of optimization and machine learning are increasingly interplayed and optimization in different problems leads to the use of machine learning approaches‎. ‎Machine learning algorithms work in reasonable computational time for specific classes of problems and have important role in extracting knowledge from large amount of data‎. ‎In this paper‎, ‎a methodology has been employed to opt...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • Int. Arab J. Inf. Technol.

دوره 11  شماره 

صفحات  -

تاریخ انتشار 2014